Entry Name:  "TTU-Lenin-MC2"

VAST Challenge 2014
Mini-Challenge 2

 

 

Team Members:

Lenin Mookiah, Tennessee Tech University, lmookiah42@students.tntech.edu     PRIMARY
Prof. William (Bill) Eberle, Tennessee Tech University, WEberle@tntech.edu
Prof. Larry Holder, Washington State University, holder@wsu.edu

Student Team: YES

 

Analytic Tools Used:

Graph Based Anomaly Detection (GBAD), developed by the Big Data and Knowledge Discovery Group at Tennessee Tech University.

 

Approximately how many hours were spent working on this submission in total?

80 hours

 

May we post your submission in the Visual Analytics Benchmark Repository after VAST Challenge 2014 is complete? YES

 

Video:

https://www.youtube.com/watch?v=nQUhNUF0YQA

 

MC2-Lenin-VAST

 

 

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Questions

 

MC2.1 Describe common daily routines for GAStech employees. What does a day in the life of a typical GAStech employee look like?


GBAD Introduction:

The GBAD graph-based anomaly detection tool suite discovers structural anomalies in data represented as a graph. The Minimum Description Length (MDL) principle is used to identify the normative pattern, that is, the pattern that minimizes the number of bits needed to describe the input graph after the graph has been compressed by that pattern. In this experiment, the normative pattern corresponds to the most likely normal (daily) activity of a given employee.

The GBAD probability algorithm uses the MDL evaluation technique to discover the normative pattern in a graph, but instead of examining all instances for similarity, this approach examines all extensions to the normative substructure (pattern), looking for extensions with the lowest probability. In other words, the algorithm examines the probability of extensions to the normative pattern to determine if there is an instance that includes edges and vertices that are probabilistically less likely than other possible extensions.
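The idea of scoring extensions by probability can be illustrated with a minimal sketch. This is not GBAD's implementation: the function names and the sample data are hypothetical, and each "instance" is simplified to a day with a list of extension labels (places visited beyond the normative pattern).

```python
from collections import Counter


def extension_probabilities(instances):
    """Return {extension label: probability} over all observed extensions.

    `instances` maps an instance id of the normative pattern (here, a day)
    to the list of extension labels attached to that instance.
    """
    counts = Counter(label for exts in instances.values() for label in exts)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()} if total else {}


def least_likely_extensions(instances):
    """Return sorted (instance id, label) pairs whose extension is least probable."""
    probs = extension_probabilities(instances)
    if not probs:
        return []
    lowest = min(probs.values())
    return sorted(
        (inst, label)
        for inst, exts in instances.items()
        for label in exts
        if probs[label] == lowest
    )


# Hypothetical extensions of one employee's normative daily pattern.
days = {
    "01/08": ["Guy's Gyros"],
    "01/09": ["Guy's Gyros", "Hippokampos"],
    "01/10": ["Guy's Gyros", "n ketallinias st"],
    "01/13": ["Hippokampos"],
}
print(least_likely_extensions(days))
```

In this toy input the single visit to "n ketallinias st" is the least probable extension, so that day would be the one flagged for review.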


Summary
A typical GAStech employee visits "Guy's Gyros" for meals and, a couple of times, visits coffee shops (e.g., Brew've Been Served), Kalami Kafenion, and Hippokampos. They also visit "Ouzeri Elian" (a bar), sometimes grocery stores (e.g., General Grocer, Kronos Mart), and occasionally Gelatogalore.

Below are the normative patterns for 4 of the GAStech employees (Figures MC2.1.1-MC2.1.4).

Figure MC2.1.1: Engineering - Drill Technician. Name: Tempestad Brand


Figure MC2.1.2: Engineering - Drill Site Manager. Name: Onda Marin


Figure MC2.1.3: Engineering - Engineer. Name: Balas Felix


Figure MC2.1.4: Security - Perimeter Control. Name: Osvaldo Hennie



 

MC2.2 Identify up to twelve unusual events or patterns that you see in the data. If you identify more than twelve patterns during your analysis, focus your answer on the patterns you consider to be most important for further investigation to help find the missing staff members. For each pattern or event you identify, describe:

a.       What is the pattern or event you observe?

An event occurs along the path of "Rist Way" (near Chostus Hotel) and around "Spetson Park". The suspects spend time passing through "niovis st" and "exadakitiou way" in the "Rist Way" area and on several streets (listed under c. below) around "Spetson Park".

b.       Who is involved?

The suspects are listed below by assigned car (Car ID), Department - Designation, and Name.
Car ID: 11 - Engineering - Hydraulic Technician (Name: Cazar Gustav).
Car ID: 9 - Engineering - Drill Technician (Name: Calzas Axel).
Car ID: 3 - Engineering - Engineer (Name: Balas Felix).
Car ID: 16 - Security - Perimeter Control (Name: Vann Isia).
Car ID: 21 - Security - Perimeter Control (Name: Osvaldo Hennie).
Car ID: 26 - Engineering - Drill Site Manager (Name: Onda Marin).
Car ID: 14 - Engineering - Engineering Group Manager (Name: Dedos Lidelse).
Car ID: 34 - Security - Perimeter Control (Name: Vann Edvard).
Car ID: 33 - Engineering - Drill Technician (Name: Tempestad Brand).
Car ID: 24 - Security - Perimeter Control (Name: Mies Minke).

c.       What locations are involved?

"niovis st", "exadakitiou way", "n estos st", "n utmana st", "n ketallinias st", "n ithakis st", "n oddisseos st"

d.       When does the pattern or event take place?

The pattern occurs between the evening of 01/10 and 01/11. For each event that contributes to the pattern, the pattern (graph) is shown, followed by a table that lists the movements (events) recorded for that suspect.
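The "hours spent" comments in the event tables can be read as gaps between consecutive GPS records for a car. The sketch below shows one way to compute such gaps. It is an illustration under stated assumptions, not the script used for this submission: the function name, the 3-hour threshold, and the choice to attribute a gap to the location where it starts are all hypothetical. The sample records are copied from Table MC2.2.3.

```python
from datetime import datetime, timedelta


def long_stays(records, min_gap=timedelta(hours=3)):
    """Return (location, start, end, gap) for each gap of at least `min_gap`.

    `records` is a list of (timestamp, location) pairs with timestamps written
    as in the tables, e.g. "1/10/14 23:37". The gap is attributed to the
    location of the record that starts it, which is an assumption.
    """
    parsed = sorted(
        (datetime.strptime(ts, "%m/%d/%y %H:%M"), loc) for ts, loc in records
    )
    stays = []
    for (start, loc), (end, _next_loc) in zip(parsed, parsed[1:]):
        gap = end - start
        if gap >= min_gap:
            stays.append((loc, start, end, gap))
    return stays


# Records for Car ID 33, copied from Table MC2.2.3.
car_33 = [
    ("1/10/14 19:30", "n ketallinias st 4600 4650"),
    ("1/10/14 23:37", "n ketallinias st 4600 4650"),
    ("1/10/14 23:41", "n estos st 3600 3698"),
    ("1/10/14 23:43", "exadakitiou way"),
    ("1/10/14 23:45", "niovis st 2700 2798"),
    ("1/10/14 23:46", "n sirrakou st 601 2499"),
    ("1/11/14 19:38", "niovis st 2700 2798"),
]
for loc, start, end, gap in long_stays(car_33):
    print(loc, start, end, gap)
```

A long gap only shows that no record was logged in between; it may reflect a parked car or missing data, so flagged gaps still need manual review.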

 

 

Event 1: Figure MC2.2.1:


Event 1: Car ID: 21 - Security - Perimeter Control (Name: Osvaldo Hennie):

On 01/11 in the afternoon, the employee spent more than 6 hours at "n utmana st 3600 3698" (near "Spetson Park"), passing through "niovis st" and "exadakitiou way".

Event 1: Table MC2.2.1: Car ID: 21

DateTime          Location                    Comments
1/10/14 17:30     n souliou st 1500 1522
1/11/14 3:25      niovis st 3100 3198
1/11/14 3:28      exadakitiou way
1/11/14 3:31      n utmana st 3600 3698       night spent
1/11/14 11:03     exadakitiou way


Event 2: Figure MC2.2.2:


Event 2: Car ID: 16 - Security - Perimeter Control (Name: Vann Isia):

Late at night on 01/10, after passing through "exadakitiou way", the employee spent about 3 hours around midnight at "n utmana st 3700 3798".

Event 2: Table MC2.2.2: Car ID: 16

DateTime          Location                    Comments
1/10/14 23:05     exadakitiou way
1/10/14 23:21     n utmana st 3700 3798
1/11/14 3:23      n utmana st 3700 3798       3-hours spent


Event 3: Figure MC2.2.3:


Event 3: Car ID: 33 - Engineering - Drill Technician (Name: Tempestad Brand):

On 01/10, late at night, the employee spent about 4 hours at "n ketallinias st 4600 4650" (near "Spetson Park"), passing through "exadakitiou way" and "niovis st 2700 2798". Around midnight, time was spent between "niovis st" and "exadakitiou way".

Event 3: Table MC2.2.3: Car ID: 33

DateTime          Location                    Comments
1/10/14 19:30     n ketallinias st 4600 4650
1/10/14 23:37     n ketallinias st 4600 4650  around 4-hours spent
1/10/14 23:41     n estos st 3600 3698
1/10/14 23:43     exadakitiou way
1/10/14 23:45     niovis st 2700 2798
1/10/14 23:46     n sirrakou st 601 2499
1/11/14 19:38     niovis st 2700 2798         next day available in same location (may late-night spent)


Event 4: Figure MC2.2.4:


Event 4: Car ID: 34 - Security - Perimeter Control (Name: Vann Edvard):

The employee passed through "niovis st" and "exadakitiou way". On 01/10, about 4 hours were spent between "exadakitiou way" and "n estos st 3600 3698" (near "Spetson Park").

Event 4: Table MC2.2.4: Car ID: 34

DateTime          Location                    Comments
1/10/14 17:29     n tangno st 800 898
1/11/14 14:12     n kritis rd 2700 2898
1/11/14 14:13     niovis st 3100 3198
1/11/14 14:13     niovis st 3100 3198
1/11/14 14:15     exadakitiou way
1/11/14 14:15     exadakitiou way
1/11/14 18:11     n estos st 3600 3698        near Spetson Park ; 4-hours


Event 5: Figure MC2.2.5:


Event 5: Car ID: 26 - Engineering - Drill Site Manager (Name: Onda Marin):

The employee passed through "niovis st" and "exadakitiou way". Around midnight on 01/10, approximately 4 hours were spent between "exadakitiou way" and "n estos st 3600 3698" (near "Spetson Park").

Event 5: Table MC2.2.5: Car ID: 26

DateTime          Location                    Comments
1/10/14 20:10     n ketallinias st 4600 4650
1/11/14 0:22      n ketallinias st 4600 4650  Mid-night around 4-hours spent


Event 6: Figure MC2.2.6:


Event 6: Car ID: 11 - Engineering - Hydraulic Technician (Name: Cazar Gustav):

The employee spent around 5 hours at "n ketallinias st 4600 4650" (near "Spetson Park").

Event 6: Table MC2.2.6: Car ID: 11

DateTime          Location                    Comments
1/10/14 18:45     n ketallinias st 4600 4650
1/10/14 23:23     n ketallinias st 4600 4650  near mid-night spent around 4-hours


Event 7: Figure MC2.2.7:


Event 7: Car ID: 24 - Security - Perimeter Control (Name: Mies Minke):

The employee spent the night passing through "niovis st" and "exadakitiou way". Around 3 hours were spent between "n ithakis st 3700 3848" and "n oddisseos st 3600 3698".

Event 7: Table MC2.2.7: Car ID: 24

DateTime          Location                    Comments
1/10/14 11:18     n scarkeme st 2300 2598
1/11/14 12:55     niovis st 2700 2798
1/11/14 13:39     n ithakis st 3700 3848      near Spetson Park
1/11/14 16:15     n oddisseos st 3600 3698    About 3 hours spent; near Spetson Park
1/11/14 16:18     exadakitiou way
1/11/14 16:20     niovis st 2900 2998


Event 8: Figure MC2.2.8:


Event 8: Car ID: 3 - Engineering - Engineer (Name: Balas Felix):

The employee spent about 5 hours around midnight at "n ketallinias st 4600 4650" (near "Spetson Park").

Event 8: Table MC2.2.8: Car ID: 3

DateTime          Location                    Comments
1/10/14 19:03     n ketallinias st 4600 4650
1/11/14 0:29      n ketallinias st 4600 4650  Mid-night spent
1/11/14 00:29:28  n omirou st 4700 4798


Event 9: Figure MC2.2.9:


Event 9: Car ID: 9 - Engineering - Drill Technician (Name: Calzas Axel):

On 01/10, near midnight, the employee spent about 4 hours at "n ketallinias st 4600 4650" (near "Spetson Park").

Event 9: Table MC2.2.9: Car ID: 9

DateTime          Location                    Comments
1/10/14 19:11     n ketallinias st 4600 4650
1/10/14 19:12     n ketallinias st 4600 4650
1/10/14 23:55     n ketallinias st 4600 4650
1/10/14 23:55     n ithakis st 4500 4598
1/11/14 19:34     n ithakis st 3700 3848      may be late-night stay


Event 10: Figure MC2.2.10:


Event 10: Car ID: 14 - Engineering - Engineering Group Manager (Name: Dedos Lidelse):

On 01/10, late at night, the employee spent about 4 hours at "n ketallinias st" and then passed through "niovis st".

Event 10: Table MC2.2.10: Car ID: 14

DateTime          Location                    Comments
1/10/14 18:59     n ketallinias st 4600 4650
1/10/14 23:30     n ketallinias st 4600 4650  around 5-hours spent
1/10/14 23:38     niovis st 2900 2998
1/10/14 23:38     niovis st 2700 2798
1/12/14 12:31     niovis st 2900 2998         01/11 data missing


e.       Why is this pattern or event significant?

The pattern is significant because at least 8 employees move around locations near "Spetson Park" and "Chostus Hotel", which are away from the office and from their regular eating places.

f.       What is your level of confidence about this pattern or event?  Why?

Very few employees visit these locations, which do not appear to be related to work, eating, or shopping. These suspicious patterns of movement are rare in the data.

 

 

 

 

 

MC2.3 Like most datasets, the data you were provided is imperfect, with possible issues such as missing data, conflicting data, data of varying resolutions, outliers, or other kinds of confusing data.  Considering MC2 data is primarily spatiotemporal, describe how you identified and addressed the uncertainties and conflicts inherent in this data to reach your conclusions in questions MC2.1 and MC2.2.


We ran a simple script that calculates statistics on the data, such as the number of distinct days and the number of GPS (car) movements recorded for each employee. By analyzing these statistics, we found missing data for a few employees, as described below.
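A minimal sketch of this kind of coverage check is shown below. It is not the script used for this submission: the function name, the (car ID, timestamp) row layout, and the date range are assumptions made for illustration. The sample rows reuse the Car ID 14 timestamps from Table MC2.2.10.

```python
from collections import defaultdict
from datetime import date, datetime, timedelta


def gps_coverage(rows, first_day, last_day):
    """Summarize GPS records per car between `first_day` and `last_day`.

    `rows` is an iterable of (car_id, timestamp) pairs, with timestamps
    written as "M/D/YY H:MM" (an assumed layout, matching the tables).
    """
    per_day = defaultdict(lambda: defaultdict(int))
    for car_id, ts in rows:
        day = datetime.strptime(ts, "%m/%d/%y %H:%M").date()
        per_day[car_id][day] += 1

    span = (last_day - first_day).days + 1
    all_days = [first_day + timedelta(days=i) for i in range(span)]

    report = {}
    for car_id, counts in per_day.items():
        report[car_id] = {
            "distinct_days": len(counts),
            "records": sum(counts.values()),
            "missing_days": [d for d in all_days if d not in counts],
        }
    return report


# Timestamps for Car ID 14, copied from Table MC2.2.10.
rows = [
    (14, "1/10/14 18:59"),
    (14, "1/10/14 23:30"),
    (14, "1/10/14 23:38"),
    (14, "1/10/14 23:38"),
    (14, "1/12/14 12:31"),
]
report = gps_coverage(rows, date(2014, 1, 10), date(2014, 1, 12))
print(report[14])
```

For these rows the report lists 01/11 as a day without records for Car ID 14. A day with no records may still be a day on which the car simply did not move, so the output is a list of candidates rather than confirmed data loss.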

1) Badging Office (Herrero Kanon)

Data for 01/06 is missing (only one GPS location was recorded). On 01/07, there are fewer GPS recordings than usual. For this employee, only morning and evening data were recorded. Similarly, for "Frente Vira", the data for 01/15 starts at 12:18, although one would expect the employee to have been recorded at least traveling from home to work.

2) Perimeter Control (Osvaldo Hennie)

The most important suspicious event we look for falls between 01/10 and 01/11, so although data for 01/12 is missing for this employee, we still conclude that the employee is suspicious.

3) Engineering Group Manager (Dedos Lidelse)

01/11 falls on a weekend. Data for 01/11 is missing for this employee, either because the employee did not move at all that day or because the data is missing (imperfect), as mentioned in the question.

4) IT Group Manager (Bergen Linnea)

01/12 falls on a weekend. Data for that day is missing for this employee, either because the employee did not move at all that day or because the data is missing (imperfect), as mentioned in the question.

5) Outliers

For Vasco-Pais Willem, GBAD outputs "n utmana st", which is a suspected location, but the employee visits this location regularly; hence we conclude that the employee is not a suspect. Similarly, for Strum Orhan, GBAD flags "n mikonou st" on 01/19 as an anomaly. This location is close to "Spetson Park", but it does not intersect with our suspicious events between 01/10 and 01/11, so we also conclude that this employee is not a suspect.


Figure MC2.3.1: Vasco-Pais Willem on 01/11.


Figure MC2.3.2: Strum Orhan on 01/19.
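The outlier screening described above for Vasco-Pais Willem and Strum Orhan can be written down as a small rule. The sketch below is illustrative only: the function name, the threshold for "regular" visits, and the visit dates for the first example are hypothetical, while the 01/10 to 01/11 window and the 01/19 anomaly date come from the text.

```python
from datetime import date


def screen_anomaly(anomaly_day, visit_days, window, min_regular_days=3):
    """Return (is_suspect, reason) for a location flagged as anomalous.

    `visit_days` lists the days on which the employee was seen at the flagged
    location; `window` is the (start, end) period of the suspicious events.
    The `min_regular_days` threshold is an assumption, not a value from the data.
    """
    if len(set(visit_days)) >= min_regular_days:
        return False, "regularly visited location"
    start, end = window
    if not (start <= anomaly_day <= end):
        return False, "outside the window of interest"
    return True, "rare visit inside the window of interest"


window = (date(2014, 1, 10), date(2014, 1, 11))

# Hypothetical visit days standing in for an employee who visits regularly.
regular_visits = [date(2014, 1, 7), date(2014, 1, 9), date(2014, 1, 11), date(2014, 1, 14)]
print(screen_anomaly(date(2014, 1, 11), regular_visits, window))

# An anomaly on 01/19, outside the 01/10 to 01/11 window.
print(screen_anomaly(date(2014, 1, 19), [date(2014, 1, 19)], window))
```

Both example calls return a non-suspect result, for the two different reasons given in the text; a single visit inside the window would be kept as a suspect.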